Multi-class Imbalanced Data-Sets with Linguistic Fuzzy Rule Based Classification Systems Based on Pairwise Learning
Abstract
In a classification task, the class imbalance problem arises when the examples in the data-set are distributed very unevenly among its classes. The main handicap of this type of problem is that standard learning algorithms assume a balanced training set, which biases them towards the majority classes. In order to identify the different classes of the problem correctly, we propose a two-step methodology. First, we use the one-vs-one binarization technique to decompose the original data-set into binary classification problems. Then, whenever one of these binary subproblems is imbalanced, we apply an oversampling step, using the SMOTE algorithm, to rebalance the data before the pairwise learning process. For our experimental study we take a linguistic Fuzzy Rule Based Classification System as the base algorithm. We aim to show not only the improvement in performance that our methodology achieves over the basic approach, but also the good synergy between the pairwise learning proposal and the selected oversampling technique.
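The two steps described in the abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `smote` and `ovo_rebalanced`, the imbalance-ratio threshold of 1.5, the choice of k = 5 neighbours, and the toy data are all assumptions made for the example, and the fuzzy rule based classifier that would be trained on each subproblem is left out.

```python
import random
from collections import Counter
from itertools import combinations


def smote(minority, n_new, k=5, rng=None):
    """Simplified SMOTE: build n_new synthetic points, each interpolated
    between a minority example and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority examples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority examples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def ovo_rebalanced(X, y, ir_threshold=1.5, k=5, seed=0):
    """One-vs-one decomposition; oversample a pair only if it is imbalanced.

    ir_threshold is a hypothetical cut-off on the imbalance ratio
    (majority size / minority size) chosen for this sketch.
    """
    rng = random.Random(seed)
    subproblems = {}
    for a, b in combinations(sorted(set(y)), 2):
        Xa = [x for x, label in zip(X, y) if label == a]
        Xb = [x for x, label in zip(X, y) if label == b]
        # Order the two classes so the smaller one comes first.
        (min_label, Xmin), (maj_label, Xmaj) = sorted(
            [(a, Xa), (b, Xb)], key=lambda pair: len(pair[1])
        )
        if len(Xmaj) / len(Xmin) > ir_threshold:
            Xmin = Xmin + smote(Xmin, len(Xmaj) - len(Xmin), k, rng)
        sub_X = Xmin + Xmaj
        sub_y = [min_label] * len(Xmin) + [maj_label] * len(Xmaj)
        subproblems[(a, b)] = (sub_X, sub_y)
    return subproblems


# Toy three-class data-set: 6, 3 and 2 examples per class.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4), (0.4, 0.2),
     (1.0, 1.0), (1.1, 0.9), (1.2, 1.1),
     (2.0, 0.0), (2.1, 0.2)]
y = ["a"] * 6 + ["b"] * 3 + ["c"] * 2

for pair, (sub_X, sub_y) in ovo_rebalanced(X, y).items():
    print(pair, dict(Counter(sub_y)))
```

With this toy data, the pairs involving class "a" exceed the assumed threshold and are oversampled to equal class sizes, while the pair ("b", "c") has an imbalance ratio of exactly 1.5 and is left untouched. In a complete system, one binary classifier would be trained per rebalanced subproblem and their outputs combined, for example by voting, to label a new example.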
Similar resources
On Mining Fuzzy Classification Rules for Imbalanced Data
A fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification. One of the major issues when applying it to imbalanced data sets is its bias towards the majority class, such that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
A hierarchical genetic fuzzy system based on genetic programming for addressing classification with highly imbalanced and borderline data-sets
Many real-world applications involve classification with imbalanced data-sets. This problem arises when the number of instances from one class is quite different from the number of instances from the other class. Traditionally, classification algorithms are unable to correctly deal with this issue as they are biased towards the majority class. Therefore, algorithms tend to mis...
Solving multi-class problems with linguistic fuzzy rule based classification systems based on pairwise learning and preference relations
This paper deals with multi-class classification for linguistic fuzzy rule based classification systems. The idea is to decompose the original data-set into binary classification problems using the pairwise learning approach (confronting all pairs of classes), and to obtain an independent fuzzy system for each one of them. During the inference process, each fuzzy rule based classification system ...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...